Take Home Exercise 2

Take Home Exercise 2: Examining Airbnb and how its expansion has affected our economy, using spatial point pattern analysis of Airbnb listings in Singapore.

Sarah Chin linkedin.com/in/sarahchin99/
09-14-2021

1. Overview

Airbnb has expanded its services to more than 34,000 cities across 191 countries. However, Singapore remains one of the global cities that has yet to legalise short-term rentals offered by platforms such as Airbnb. Although such rentals are not legal in Singapore, tools and datasets are still available that allow people to explore how Airbnb is used in the city.

2. Installing and Loading the packages

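packages <- c('maptools', 'sf', 'raster', 'spatstat', 'tmap', 'onemapsgapi')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  # Attach the package for the rest of the session
  library(p, character.only = TRUE)
}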

3. Section A: Airbnb Distribution in 2019

In this section, we investigate whether the distribution of Airbnb listings is affected by location factors such as proximity to existing hotels, MRT services and tourist attractions.

Before we can analyse these points, we need to import and clean our data. First, we import the Airbnb data using read.csv(). The conversion to an sf object and the transformation of the coordinate system to EPSG:3414 follow once all the datasets have been imported.

airbnb <- read.csv("Airbnb_listing_30062019/30062019.csv")

We also want to extract the number and locations of hotels and tourist attractions in Singapore to see how this competition affects the Airbnb listings.

hotels <- read.csv("OneMap_Data/hotels.csv")
tourism <- read.csv("OneMap_Data/tourism.csv")

Since all the imported datasets are in .csv format, we need to convert them to sf objects for further analysis. We also need to transform the coordinate system to EPSG:3414 (SVY21), the projected coordinate system of Singapore. As all of the latitude and longitude values are provided in decimal degrees, we will assume that the data is in the WGS 84 geographic coordinate system (EPSG:4326).

airbnb_sf <- st_as_sf(airbnb, 
                       coords = c("longitude", "latitude"),
                       crs=4326) %>%
  st_transform(crs = 3414)

hotels_sf <- st_as_sf(hotels, 
                       coords = c("Lng", "Lat"),
                       crs=4326) %>%
  st_transform(crs = 3414)

tourism_sf <- st_as_sf(tourism, 
                       coords = c("Lng", "Lat"),
                       crs=4326) %>%
  st_transform(crs = 3414)
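
As a quick sanity check on the conversion, the sketch below applies the same two steps to a single made-up point and inspects the result. The toy data frame and its values are hypothetical and are not taken from the imported files. Only st_as_sf(), st_transform(), st_crs() and st_coordinates() from sf are used.

```r
library(sf)

# Hypothetical single-row table standing in for one of the imported CSVs
toy <- data.frame(name = "sample", longitude = 103.8198, latitude = 1.3521)

# Same two steps as above: declare WGS 84, then project to SVY21
toy_sf <- st_as_sf(toy, coords = c("longitude", "latitude"), crs = 4326)
toy_svy21 <- st_transform(toy_sf, crs = 3414)

# EPSG code of the projected object
st_crs(toy_svy21)$epsg

# Easting and northing, now expressed in metres rather than degrees
xy <- st_coordinates(toy_svy21)
xy
```

If the same check on airbnb_sf, hotels_sf or tourism_sf reports a different EPSG code, or coordinates that still look like degrees, the transformation step was probably skipped.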

Let’s plot to review the datasets that have been provided. This is the Airbnb map using airbnb_sf.

tmap_mode("view")
tm_shape(airbnb_sf) + 
  tm_dots(alpha = 0.4, 
          col = "blue", 
          size = 0.05)

Here is the hotels map using hotels_sf.

tm_shape(hotels_sf) + 
  tm_dots(alpha = 0.4, 
          col = "red", 
          size = 0.05)

Here are the tourist attractions available in Singapore, using tourism_sf.

tm_shape(tourism_sf) + 
  tm_dots(alpha = 0.4, 
          col = "purple", 
          size = 0.05)

As we can see from the map of tourism_sf above, one point lies outside Singapore. This suggests that the record may have missing or invalid coordinate values. We can verify this by counting the missing values in the LATITUDE column.

sum(is.na(tourism_sf$LATITUDE))
[1] 1

The result shows that there is one NA value in the LATITUDE column of the tourism_sf dataset. We shall remove that row so that our findings concentrate on Singapore.

tourism_sf <- tourism_sf[!is.na(tourism_sf$LATITUDE),]

After the NA row has been removed, we can plot the map again to see if there is an improvement.

tm_shape(tourism_sf) + 
  tm_dots(alpha = 0.4, 
          col = "purple", 
          size = 0.05)
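
The check above looks at the LATITUDE column only. A slightly broader version, sketched here on a hypothetical toy table rather than on the actual tourism data, counts the missing values in every column and keeps only the rows in which both coordinate columns are present. The column names follow the tourism dataset (including its LONGTITUDE spelling), but the values are invented for illustration.

```r
# Hypothetical toy table; the values are illustrative, not from tourism.csv
toy <- data.frame(
  NAME = c("A", "B", "C"),
  LATITUDE = c(1.2816, NA, 1.3048),
  LONGTITUDE = c(103.8636, 103.8198, NA)
)

# Missing values per column, so that no coordinate column is overlooked
na_counts <- colSums(is.na(toy))
na_counts

# Keep only the rows where both coordinate columns are present
toy_clean <- toy[complete.cases(toy[, c("LATITUDE", "LONGTITUDE")]), ]
nrow(toy_clean)
```

Counting per column first makes it easier to spot whether the problem is confined to one field before any rows are dropped.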

After cleaning the tourism_sf dataset, we can finally plot the 3 datasets together to see if there is any spatial relationship between them. The Airbnb listings are shown in blue, the hotels in red and the tourist attractions in purple.

tmap_mode("view")
tm_shape(airbnb_sf) + 
  tm_dots(alpha = 0.4, 
          col = "blue", 
          size = 0.05) +
tm_shape(hotels_sf) + 
  tm_dots(alpha = 0.4, 
          col = "red", 
          size = 0.05) +
tm_shape(tourism_sf) + 
  tm_dots(alpha = 0.4, 
          col = "purple", 
          size = 0.05)

From the map above, we can see that Airbnb listings are spread widely across Singapore, including areas where no hotels are available. In contrast, the majority of the hotels are located in the central district, with a few exceptions such as RM Hotel in the far west and the Changi hotels in the east. The location of the hotels also appears to be related to that of the tourist attractions, most of which are likewise within the central district. A plausible explanation is that hotels locate themselves near tourist attractions in order to profit from tourists, who prefer to stay close to these attractions.
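
The visual impression from the map could be backed up with a simple number, such as the distance from each listing to its nearest hotel. The sketch below shows the idea in base R on made-up projected coordinates in metres. The matrices listings and hotels are hypothetical and are not derived from airbnb_sf or hotels_sf.

```r
# Hypothetical projected coordinates in metres (not taken from the datasets)
listings <- cbind(x = c(0, 1000, 5000), y = c(0, 0, 0))
hotels   <- cbind(x = c(0, 4000), y = c(300, 0))

# Euclidean distance from every listing (rows) to every hotel (columns)
dx <- outer(listings[, "x"], hotels[, "x"], "-")
dy <- outer(listings[, "y"], hotels[, "y"], "-")
dist_m <- sqrt(dx^2 + dy^2)

# Distance from each listing to its closest hotel
nearest_hotel_m <- apply(dist_m, 1, min)
nearest_hotel_m
summary(nearest_hotel_m)
```

Because the real datasets are already in EPSG:3414, their coordinates are in metres as well, so the same calculation would give distances in metres. A full distance matrix for several thousand listings can be large, so this is meant as an illustration of the idea rather than an efficient method.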

Now that we have plotted our maps, we can start the geospatial data wrangling process.

Geospatial Data Wrangling

One of the objectives of this task is to derive the kernel density maps of the Airbnb listings, hotels, MRT services and tourist attractions. Before we can analyse any of the data that we have plotted so far, we need to convert it into the format that spatstat expects, using the following steps.

Step 1: Converting sf data frames to sp’s Spatial class

As airbnb_sf, hotels_sf and tourism_sf are all sf data frames, we first need to convert them into sp's Spatial class.

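airbnb_spatial <- as_Spatial(airbnb_sf)
hotels_spatial <- as_Spatial(hotels_sf)
tourism_spatial <- as_Spatial(tourism_sf)

# Print each object to inspect its class, extent and CRS
airbnb_spatial
hotels_spatial
tourism_spatial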
class       : SpatialPointsDataFrame 
features    : 8293 
extent      : 7215.566, 44098.31, 25166.35, 49226.35  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 14
names       :       id,                                              name,   host_id, host_name, neighbourhood_group, neighbourhood,       room_type, price, minimum_nights, number_of_reviews, last_review, reviews_per_month, calculated_host_listings_count, availability_365 
min values  :    49091,                                                  ,     23666,          ,      Central Region,    Ang Mo Kio, Entire home/apt,     0,              1,                 0,            ,              0.01,                              1,                0 
max values  : 36053005, ZR2- NEW! Sunny & Modern Apt 4 mins to Orchard Rd, 271165196,    Zuzana,         West Region,        Yishun,     Shared room, 13999,           1000,               308,  2019-06-25,             12.09,                            277,              365 
class       : SpatialPointsDataFrame 
features    : 422 
extent      : 5939.241, 45334.18, 25379.44, 44562.4  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 7
names       :                              NAME, ADDRESSPOSTALCODE,                                 ADDRESSSTREETNAME,         HYPERLINK, TOTALROOMS,     KEEPERNAME, ICON_NAME 
min values  :                      30 BENCOOLEN,             18956,                                 1 Bayfront Avenue, 96ytlim@gmail.com,          4,  Adel Aramouni, hotel.gif 
max values  : YotelAir Singapore Changi Airport,            819666, 99 IRRAWADDY ROAD, # 22-00 ROYAL SQUARE AT NOVENA, zubair@dam.com.sg,       2561, Zhang YuanQing, hotel.gif 
class       : SpatialPointsDataFrame 
features    : 106 
extent      : 11380.23, 43659.54, 22869.34, 47596.73  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 15
names       :                                  NAME,                                                                                                                                                   DESCRIPTION,                  ADDRESSSTREETNAME,                            HYPERLINK,                                                                                                                             PHOTOURL,                                                                                          URL_PATH,                                                                                                          IMAGE_ALT_TEXT,                PHOTOCREDITS,                  LASTMODIFIED,  LATITUDE, LONGTITUDE,                                                                                             META_DESCRIPTION,                                                                                                                                                                                   OPENING_HOURS,        ICON_NAME, ADDRESSPOSTALCODE 
min values  : Adventure Cove Waterpark™ Singapore, A feat of engineering, an architectural statement and a sheer aesthetic triumph, Marina Bay Sands<sup>®</sup> has upped the ante for buildings in Singapore.,                       1 Beach Road,                   http://acm.org.sg/,                                  /content/dam/desktop/global/see-do-singapore/architecture/hajjah-fatimah-mosque-carousel01-rect.jpg, www.yoursingapore.com/en/see-do-singapore/architecture/historical/capitol-building-singapore.html,               Adults and kids of all ages who are not even science buffs will have fun at the Singapore Science Centre.,                   8Q at SAM, 2015-03-30T12:57:27.648+08:00, 1.2230965,  103.68398, A tranquil patch of imperial China in the west of Singapore is pleasant respite from the bustle of the city.,                                                                                                                                                       50th storey Skybridge, Daily, 9am –10pm, tourist_spot.gif,                 0 
max values  :            Victoria Theatre Singapore,                                                   With so many attractions packed into this 15-km stretch of beaches, you’ll never run out of things to do., Seng Poh Road and Tiong Bahru Road, https://www.pub.gov.sg/marinabarrage, www.yoursingapore.com/content/dam/desktop/global/see-do-singapore/recreation-leisure/universal-studios-singapore-carousel01-rect.jpg,      www.yoursingapore.com/en/see-do-singapore/recreation-leisure/viewpoints/singapore-flyer.html, Whether you prefer water sports, rollerblading or cycling, find a myriad of things to do at East Coast Park, Singapore., Wildlife Reserves Singapore, 2015-11-03T17:55:41.364+08:00,   1.44672,  103.97403,                                     With the Henderson Waves bridge, form meets function to stunning effect., Visits are by appointment only.Visitors must sign up in advance for heritage tours which fall on:Monday, 2pm – 3pm,Tuesday, 6.30pm – 7.30pm,Thursday, 10am – 11am,Saturday, 11am – 12pm, tourist_spot.gif,                 0 

Step 2: Converting the Spatial class into generic sp format

As spatstat requires the data in ppp format and there is no direct way to convert a SpatialPointsDataFrame into ppp, we first need to convert the data into generic SpatialPoints objects.

airbnb_sp <- as(airbnb_spatial, "SpatialPoints")
hotels_sp <- as(hotels_spatial, "SpatialPoints")
tourism_sp <- as(tourism_spatial, "SpatialPoints")

# Print each object to confirm that only the point geometry remains
airbnb_sp
hotels_sp
tourism_sp
class       : SpatialPoints 
features    : 8293 
extent      : 7215.566, 44098.31, 25166.35, 49226.35  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
class       : SpatialPoints 
features    : 422 
extent      : 5939.241, 45334.18, 25379.44, 44562.4  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
class       : SpatialPoints 
features    : 106 
extent      : 11380.23, 43659.54, 22869.34, 47596.73  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 

Step 3: Converting sp format into spatstat’s ppp format

Now that the datasets are in sp format, we can finally convert the datasets into ppp object format.

airbnb_ppp <- as(airbnb_sp, "ppp")
hotels_ppp <- as(hotels_sp, "ppp")
tourism_ppp <- as(tourism_sp, "ppp")

# Print each point pattern to check the point count and window
airbnb_ppp
hotels_ppp
tourism_ppp
Planar point pattern: 8293 points
window: rectangle = [7215.57, 44098.31] x [25166.35, 49226.35] units
Planar point pattern: 422 points
window: rectangle = [5939.24, 45334.18] x [25379.44, 44562.4] units
Planar point pattern: 106 points
window: rectangle = [11380.23, 43659.54] x [22869.34, 47596.73] units

Handling duplicated points

Before we can proceed, we need to check whether the data contains any duplicated points. We can do so by combining the any() and duplicated() functions.

any(duplicated(airbnb_ppp))
[1] TRUE
any(duplicated(hotels_ppp))
[1] TRUE
any(duplicated(tourism_ppp))
[1] TRUE

From the above results, we can tell that all 3 datasets contain duplicated points, so we need to handle them before moving on. One solution is jittering, which adds a small random perturbation to the points so that duplicated points no longer occupy exactly the same location. Note that rjitter() displaces every point in the pattern, not only the duplicated ones. We can apply it with the following code:

airbnb_ppp_jit <- rjitter(airbnb_ppp, 
                             retry=TRUE, 
                             nsim=1, 
                             drop=TRUE)

hotels_ppp_jit <- rjitter(hotels_ppp, 
                             retry=TRUE, 
                             nsim=1, 
                             drop=TRUE)

tourism_ppp_jit <- rjitter(tourism_ppp, 
                             retry=TRUE, 
                             nsim=1, 
                             drop=TRUE)

After running the code above, let’s check to see if there are any duplicated points left.

any(duplicated(airbnb_ppp_jit))
[1] FALSE
any(duplicated(hotels_ppp_jit))
[1] FALSE
any(duplicated(tourism_ppp_jit))
[1] FALSE
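
To make the effect of jittering more concrete, here is a minimal base R illustration on made-up coordinates. It is a simplified stand-in and not spatstat's own algorithm: each point is shifted by a uniform random amount of at most radius metres along each axis. The data frame pts and the value of radius are hypothetical.

```r
set.seed(42)  # only to make the illustration reproducible

# Hypothetical coordinates in metres; rows 1 and 2 are exact duplicates
pts <- data.frame(x = c(100, 100, 250), y = c(200, 200, 300))
n_dup <- sum(duplicated(pts))  # rows that repeat an earlier row
n_dup

# Displace every point by at most `radius` metres along each axis
radius <- 1
pts_jit <- data.frame(
  x = pts$x + runif(nrow(pts), -radius, radius),
  y = pts$y + runif(nrow(pts), -radius, radius)
)

# The formerly identical rows now differ slightly
any(duplicated(pts_jit))

# Largest shift applied to any coordinate, which stays within `radius`
max_shift <- max(abs(pts_jit$x - pts$x), abs(pts_jit$y - pts$y))
max_shift
```

The trade-off is that every location is moved slightly, so the displacement should be small relative to the distances that matter in the analysis.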